Towards Full Lexical Recognition
Abstract
Text processing in Serbian is based on a system of electronic dictionaries in the Intex format. Although lexical recognition succeeds for 75% to 90% of word forms (depending on the type of text), some categories of words remain unrecognized. In this paper we present two aspects of e-dictionary enhancement that enable the recognition of two important categories of words: named entities and words generally not recorded in traditional dictionaries. We first describe the structure and content of the dictionaries of proper names, both personal and geographic, developed to recognize the corresponding classes of named entities. We then present a set of lexical transducers that express the morphological rules governing word formation, developed for the recognition of unknown words. The resources presented significantly improve the lexical recognition process.
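The two-stage recognition outlined in the abstract (dictionary lookup first, then word-formation rules for forms the dictionary misses) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy English entries, the prefix rules, and the names `DICTIONARY`, `PREFIX_RULES`, and `recognize` are hypothetical and are not taken from the Serbian resources or from Intex. A plain rule table stands in for the lexical transducers.

```python
from typing import Optional, Tuple

# Toy stand-in for an e-dictionary of inflected forms: form -> (lemma, tag).
# These entries are illustrative only, not data from the paper.
DICTIONARY = {
    "walk": ("walk", "V"),
    "walked": ("walk", "V"),
    "modern": ("modern", "A"),
}

# Hypothetical word-formation rules:
# (prefix, tag the base word must have, tag of the derived word).
PREFIX_RULES = [
    ("anti", "A", "A"),
    ("re", "V", "V"),
]


def recognize(word: str) -> Optional[Tuple[str, str, str]]:
    """Return (lemma, tag, source) for a word form, or None if unrecognized."""
    form = word.lower()

    # Stage 1: direct lookup in the dictionary of known forms.
    if form in DICTIONARY:
        lemma, tag = DICTIONARY[form]
        return lemma, tag, "dictionary"

    # Stage 2: try to explain the unknown form as prefix + known base.
    for prefix, base_tag, derived_tag in PREFIX_RULES:
        if not form.startswith(prefix):
            continue
        base = form[len(prefix):]
        if base in DICTIONARY:
            lemma, tag = DICTIONARY[base]
            if tag == base_tag:
                return prefix + lemma, derived_tag, "rule"

    return None
```

In this sketch a form such as `rewalked` is absent from the dictionary but is accepted by the rule stage because its base `walked` is a known verb form, whereas `remodern` is rejected because the `re` rule requires a verbal base.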
Similar resources
Written word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...
The production of lexical categories (VP) and functional categories (copula) at the initial stage of child L2 acquisition
This is a longitudinal case study of two Farsi-speaking children learning English: ‘Bernard’ and ‘Melissa’, who were 7;4 and 8;4 at the start of data collection. The research deals with the initial state and further development in the child second language (L2) acquisition of syntax regarding the presence or absence of copula as a functional category, as well as the role and degree of L1 influe...
Coordination of word recognition and oculomotor control during reading: the role of implicit lexical decisions.
The coordination of word-recognition and oculomotor processes during reading was evaluated in eye-tracking experiments that examined how word skipping, where a word is not fixated during first-pass reading, is affected by the lexical status of a letter string in the parafovea and ease of recognizing that string. Ease of lexical recognition was manipulated through target-word frequency (Experime...
Tones of Reduced T1-T4 Mandarin Disyllables
The lexical meaning of Chinese words is determined by syllables and lexical tones. Phonologically, there are four full tones. Empirically, however, it remains a puzzle how tones are recognized when they are reduced in natural speech. This article presents three studies on tones of reduced disyllables: (1) a corpus study on disyllabic reduction, (2) two tone categorical identification experiment...
Phonological (un)certainty weights lexical activation
Spoken word recognition involves at least two basic computations. First is matching acoustic input to phonological categories (e.g. /b/, /p/, /d/). Second is activating words consistent with those phonological categories. Here we test the hypothesis that the listener’s probability distribution over lexical items is weighted by the outcome of both computations: uncertainty about phonological dis...